ABSTRACT 

The invention provides a processor that processes bundles of instructions 
preferentially through clusters or execution units according to thread characteristics. The 
cluster architectures of the invention preferably include the capability to process 
"multi-threaded" instructions. Selectively, the architecture either (a) processes singly-threaded 
instructions through a single cluster to avoid bypassing and to increase throughput, or (b) 
processes singly-threaded instructions through multiple clusters to increase "per thread" 
performance. The architecture may be "configurable" to operate in one of two modes: 
in a "wide" mode of operation, the processor's internal clusters collectively process 
bundled instructions of one thread of a program at the same time; in a "throughput" mode 
of operation, those clusters independently process instruction bundles of separate 
program threads. Clusters are often implemented on a common die, with a core and 
register file per cluster. 
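
The two modes of operation described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the names `Mode` and `dispatch`, the representation of a thread as a queue of bundles, and the policy of picking the first thread with pending bundles in "wide" mode are all hypothetical and chosen purely for illustration.

```python
from enum import Enum


class Mode(Enum):
    WIDE = "wide"              # all clusters cooperate on one thread
    THROUGHPUT = "throughput"  # each cluster serves a separate thread


def dispatch(threads, num_clusters, mode):
    """Model one issue cycle (hypothetical helper, not from the source).

    threads: dict mapping a thread id to a list of pending instruction
             bundles in program order; issued bundles are removed from it.
    Returns a list indexed by cluster, holding (thread_id, bundle) for a
    busy cluster and None for an idle one.
    """
    slots = [None] * num_clusters
    if mode is Mode.WIDE:
        # Assumed policy: pick the first thread that still has work, then
        # spread its next bundles across all clusters in the same cycle.
        tid = next((t for t, queue in threads.items() if queue), None)
        if tid is not None:
            for cluster in range(num_clusters):
                if not threads[tid]:
                    break
                slots[cluster] = (tid, threads[tid].pop(0))
    else:
        # Each cluster independently takes the next bundle of its own thread.
        active = [t for t, queue in threads.items() if queue]
        for cluster, tid in zip(range(num_clusters), active):
            slots[cluster] = (tid, threads[tid].pop(0))
    return slots


if __name__ == "__main__":
    print(dispatch({"A": ["a0", "a1", "a2"], "B": ["b0"]}, 2, Mode.WIDE))
    print(dispatch({"A": ["a0", "a1", "a2"], "B": ["b0"]}, 2, Mode.THROUGHPUT))
```

With two clusters, the "wide" call issues two consecutive bundles of thread A in one cycle, while the "throughput" call issues one bundle each from threads A and B. The model ignores bypassing, register files, and dependencies between bundles.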


10016663-1 




